home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
ftp.cs.arizona.edu
/
ftp.cs.arizona.edu.tar
/
ftp.cs.arizona.edu
/
icon
/
newsgrp
/
group99a.txt
/
000129_icon-group-sender _Tue Jun 8 08:27:08 1999.msg
< prev
next >
Wrap
Internet Message Format
|
2000-09-20
|
2KB
Return-Path: <icon-group-sender>
Received: (from root@localhost)
by baskerville.CS.Arizona.EDU (8.9.1a/8.9.1) id IAA08145
for icon-group-addresses; Tue, 8 Jun 1999 08:26:59 -0700 (MST)
Message-Id: <199906081526.IAA08145@baskerville.CS.Arizona.EDU>
Delivered-To: icon-group@silliac.cs.arizona.edu
To: icon-group@optima.CS.Arizona.EDU
Date: Tue, 08 Jun 1999 00:23:56 GMT
From: cbbrowne@news.isp.giganews.com (Christopher Browne)
Subject: Re: Find the longest matching substrings
Errors-To: icon-group-errors@optima.CS.Arizona.EDU
Status: RO
On 4 Jun 1999 11:05:40 -0400, gep2@terabites.com <gep2@terabites.com> wrote:
>If I were going to try to make something like this FAST (and especially if one
>of the strings were relatively constant) I'd probably want to use a hash (or
>even direct index?) table pre-computed with the offsets of character substrings
>(I'd probably use two-character adjacent pairs) in one of those strings; this
>would enable stepping rapidly through the second string and quickly finding at
>least (only just the) plausible candidates for launching more detailed substring
>compare (and length computation) operations. This kind of thing would likely be
>storage-hungry, but that's less of a concern today than it once was.
Alternatively, treat the main string as a semi-infinite string
(si-string), and insert pointers to each successive character of the
si-string into a Patricia tree. Then do a tree walk to find the longest
matches that were found.
--
"Unlike computers, guns don't have Y2K problems..."
cbbrowne@hex.net- <http://www.hex.net/~cbbrowne/computing.html>